A new algorithm for mining frequent connected subgraphs based on adjacency matrices
Abstract
Most Frequent Connected Subgraph Mining (FCSM) algorithms focus on detecting duplicate candidates using canonical form (CF) tests. CF tests have high computational complexity, which limits the efficiency of graph miners. In this paper, we introduce novel properties of canonical adjacency matrices that reduce the number of CF tests in FCSM. Based on these properties, we propose grCAM, a new algorithm for frequent connected subgraph mining. Experiments on real-world datasets show the impact of the proposed properties on FCSM. We also compare the performance of our algorithm against other reported algorithms.
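To illustrate why CF tests are costly, here is a minimal sketch of a brute-force canonical form test on a small labeled undirected graph. It is not grCAM and does not use the properties proposed in the paper. The code convention (lower triangle read row by row, vertex labels on the diagonal, the largest code taken as canonical) is one common choice and is assumed here, as are the names `matrix_code`, `canonical_code`, and `is_canonical`.

```python
from itertools import permutations


def matrix_code(vertex_labels, edge_labels, order):
    """Code of the adjacency matrix induced by a vertex ordering.

    Reads the lower triangle row by row. Diagonal cells hold vertex labels,
    off-diagonal cells hold edge labels ("0" when there is no edge).
    This convention is an assumption made for illustration.
    """
    code = []
    for i, u in enumerate(order):
        for v in order[:i]:
            code.append(edge_labels.get(frozenset((u, v)), "0"))
        code.append(vertex_labels[u])
    return tuple(code)


def canonical_code(vertex_labels, edge_labels):
    """Brute-force canonical form: the largest code over all n! orderings."""
    vertices = sorted(vertex_labels)
    return max(
        matrix_code(vertex_labels, edge_labels, p) for p in permutations(vertices)
    )


def is_canonical(vertex_labels, edge_labels, order):
    """CF test: does this ordering produce the canonical code?"""
    code = matrix_code(vertex_labels, edge_labels, tuple(order))
    return code == canonical_code(vertex_labels, edge_labels)


# Two drawings of the same path a -x- b -y- a with different vertex numbering.
g1_vertices = {0: "a", 1: "b", 2: "a"}
g1_edges = {frozenset((0, 1)): "x", frozenset((1, 2)): "y"}

g2_vertices = {0: "b", 1: "a", 2: "a"}
g2_edges = {frozenset((0, 2)): "x", frozenset((0, 1)): "y"}

# A different graph: both edges carry the label "x".
g3_vertices = {0: "a", 1: "b", 2: "a"}
g3_edges = {frozenset((0, 1)): "x", frozenset((1, 2)): "x"}

same = canonical_code(g1_vertices, g1_edges) == canonical_code(g2_vertices, g2_edges)
different = canonical_code(g1_vertices, g1_edges) != canonical_code(g3_vertices, g3_edges)
print(same, different)
```

Isomorphic candidates share a canonical code, so a miner can discard duplicates by comparing codes. The sketch enumerates every vertex ordering, which grows factorially with the number of vertices. That cost is the motivation for reducing the number of CF tests.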
Related resources
Using a Hash-Based Method for Apriori-Based Graph Mining
The problem of discovering frequent subgraphs of graph data can be solved by first constructing a candidate set of subgraphs and then identifying, within this candidate set, those subgraphs that meet the frequent subgraph requirement. In Apriori-based graph mining, determining candidate subgraphs from a huge number of generated adjacency matrices is usually the dominating factor for the overal...
A Closed Frequent Subgraph Mining Algorithm in Unique Edge Label Graphs
Problems such as closed frequent subset mining, itemset mining, and connected tree mining can be solved with polynomial delay. However, mining closed frequent connected subgraphs requires exponential time. In this paper, we present ECE-CloseSG, an algorithm for finding closed frequent unique edge label subgraphs. ECE-CloseSG uses a search space pruning and ap...
Frequent approximate subgraphs as features for graph-based image classification
The use of approximate graph matching for frequent subgraph mining has been identified in different applications as a need. To meet this need, several algorithms have been developed, but there are applications where it has not been used yet, for example image classification. In this paper, a new algorithm for mining frequent connected subgraphs over undirected and labeled graph collections VEAM...
gSpan: Graph-Based Substructure Pattern Mining
We investigate new approaches for frequent graph-based pattern mining in graph datasets and propose a novel algorithm called gSpan (graph-based Substructure pattern mining), which discovers frequent substructures without candidate generation. gSpan builds a new lexicographic order among graphs, and maps each graph to a unique minimum DFS code as its canonical label. Based on this lexicographic ...
Large Scale Graph Representations for Subgraph Census
A Subgraph Census (determining the frequency of smaller subgraphs in a network) is an important computational task at the heart of several graph mining algorithms. Recently, several efficient algorithms have been described. We focus on the g-tries, a data structure that encapsulates the topology of the smaller subgraphs in order to speed up the overall computation. Its algorithm makes extensive...
Journal: Intell. Data Anal.
Volume: 14
Publication date: 2010